Paramasivam, Ilango
- Anonymization in PPDM based on Data Distributions and Attribute Relations
Authors
Affiliations
1 School of Advanced Sciences, VIT University, Vellore - 632014, Tamil Nadu, IN
2 School of Computer Science and Engineering, VIT University, Vellore - 632014, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 37 (2016)
Abstract
Objectives: Privacy Preserving Data Mining (PPDM) techniques deal with secure data publication or communication without revealing private and sensitive information about any individual. Anonymization is considered one of the most effective techniques since it can provide a better tradeoff between data utility and privacy preservation. Methods/Statistical Analysis: Existing anonymization techniques work on individual attributes and their cardinalities and do not consider the relations among different attributes of the data. In this paper we consider auxiliary information, and use entropy and mutual information to calculate the distribution of entities in an attribute and the relations among different attributes, respectively. Based on these calculations we analyze the best generalization level for data anonymization. Findings: An adversary can analyze the data from numerous perspectives, viz. auxiliary information and theoretical and manual data analysis, and try to exploit the data's vulnerability, so improved data privacy can be achieved if we also view the data through the adversary's eyes. Applications/Improvements: Other techniques can be applied to find distributions and relations, depending on the data's background and its area of application.
Keywords
Auxiliary Information, Data Anonymization, Entropy, Mutual Information, Privacy Preserving Data Mining (PPDM).
- A Study of Impact on Missing Categorical Data - A Qualitative Review
Authors
Affiliations
1 School of Computer Science and Engineering, VIT University, Vellore - 632014, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 32 (2016)
Abstract
Objectives: In this study, the MissForest algorithm is used to analyze the impact of missing data. Statistical Analysis: For a categorical dataset, the MissForest package is used to impute missing values, and it is tested with incrementally increasing amounts of missing data: first with 5% of the attributes of the records from the original dataset missing, then with 10%, and so on. Findings: The Proportion of Falsely Classified (PFC) entries is computed for the categorical dataset. Since good performance of MissForest leads to a value close to zero and bad performance to a value around one, the performance measured for the Nursery dataset is quite good. Application: This approach works well when the ratio of missing data is low. The MissForest algorithm can be used for numerical or categorical data. The results are improved for continuous data.
Keywords
Categorical Data, Data Mining, Imputation, Missing Data, MissForest.
- A Study on Impact of Dimensionality Reduction on Naïve Bayes Classifier
Authors
Affiliations
1 Department of Computer Science, Bharathiar University, Coimbatore – 641046, Tamilnadu, IN
2 School of Computing Science & Engineering, VIT University, Vellore – 632014, Tamilnadu, IN